Influence of binary mask estimation errors on robust speaker identification

Author

  • Tobias May
Abstract

Missing-data strategies have been developed to improve the noise-robustness of automatic speech recognition systems in adverse acoustic conditions. This is achieved by classifying time-frequency (T-F) units into reliable and unreliable components, as indicated by a so-called binary mask. Different approaches have been proposed to handle unreliable feature components, each with distinct advantages. The direct masking (DM) approach attenuates unreliable T-F units in the spectral domain, which allows the extraction of conventionally used mel-frequency cepstral coefficients (MFCCs). Instead of attenuating unreliable components in the feature extraction front-end, full marginalization (FM) discards unreliable feature components in the classification back-end. Finally, bounded marginalization (BM) can be used to combine the evidence from both reliable and unreliable feature components during classification. Since each of these approaches utilizes the knowledge about reliable and unreliable feature components in a different way, they will respond differently to estimation errors in the binary mask. The goal of this study was to identify the most effective strategy to exploit knowledge about reliable and unreliable feature components in the context of automatic speaker identification (SID). A systematic evaluation under ideal and nonideal conditions demonstrated that the robustness to errors in the binary mask varied substantially across the different missing-data strategies. Moreover, full and bounded marginalization showed complementary performances in stationary and non-stationary background noises and were subsequently combined using a simple score fusion. This approach consistently outperformed individual SID systems in all considered experimental conditions. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ).
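
The difference between full and bounded marginalization, and the idea of fusing their scores, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single diagonal Gaussian per speaker (a GMM would add a log-sum-exp over components), spectral features where the clean value of an unreliable component lies between a hypothetical `lower` bound and the observed value, and an equal-weight fusion of z-normalized scores (the abstract only says "simple score fusion"). The function names are illustrative. The bounded term omits the factor 1/(x - lower), which depends only on the observation and therefore does not change the ranking of speakers.

```python
import math

def log_gauss(x, mu, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

def gauss_cdf(x, mu, var):
    """Cumulative distribution function of a 1-D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))

def frame_loglik(x, mask, mu, var, bounded=False, lower=0.0, floor=1e-300):
    """Log-likelihood of one frame under a diagonal Gaussian (assumed model).

    mask[d] == 1 marks a reliable component, 0 an unreliable one.
    bounded=False: full marginalization, unreliable components are ignored.
    bounded=True: bounded marginalization, the density of each unreliable
    component is integrated over the interval [lower, x[d]].
    """
    total = 0.0
    for xd, md, mud, vd in zip(x, mask, mu, var):
        if md:
            total += log_gauss(xd, mud, vd)
        elif bounded:
            mass = gauss_cdf(xd, mud, vd) - gauss_cdf(lower, mud, vd)
            total += math.log(max(mass, floor))  # floor avoids log(0)
    return total

def fuse_scores(scores_fm, scores_bm, weight=0.5):
    """Weighted sum of z-normalized per-speaker scores (assumed fusion rule)."""
    def znorm(scores):
        mean = sum(scores) / len(scores)
        std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores)) or 1.0
        return [(s - mean) / std for s in scores]
    fm, bm = znorm(scores_fm), znorm(scores_bm)
    return [weight * a + (1.0 - weight) * b for a, b in zip(fm, bm)]

# Toy example: two speaker models, one frame whose second component is masked.
speakers = {"A": ([1.0, 1.0], [1.0, 1.0]), "B": ([5.0, 5.0], [1.0, 1.0])}
frame, mask = [1.0, 6.0], [1, 0]
names = list(speakers)
fm = [frame_loglik(frame, mask, *speakers[n]) for n in names]
bm = [frame_loglik(frame, mask, *speakers[n], bounded=True) for n in names]
fused = fuse_scores(fm, bm)
best = names[fused.index(max(fused))]
```

In this toy case the masked second component is ignored by full marginalization and contributes only a bounded probability mass under bounded marginalization, so the decision rests on the reliable first component. The sketch also shows why mask errors matter: a noise-dominated component wrongly labeled reliable would enter `log_gauss` directly.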


Similar resources

A Two Stage Mask Estimation Approach to Robust Speaker Verification

We propose a two-stage mask estimation approach to robust speaker verification (SV) in noise environments. We consider a practical semi-blind SV scenario: the location of the target speaker is fixed while the locations of all interferers are unknown. In the first stage, we use a dual-microphone and a semi-blind degenerate unmixing estimation technique (DUET) to estimate an initial binary mask. ...


Sensorless Speed Control of Switched Reluctance Motor Drive Using the Binary Observer with Online Flux-Linkage Estimation

An adaptive online flux-linkage estimation method for the sensorless control of a switched reluctance motor (SRM) drive is presented in this paper. Sensorless operation is achieved through a binary-observer-based algorithm. In order to avoid using look-up tables of motor characteristics, which make the system dependent on motor parameters, an adaptive identification algorithm is used to estim...


Robust Identification of Smart Foam Using Set Membership Estimation in A Model Error Modeling Framework

The aim of this paper is robust identification of smart foam, as an electroacoustic transducer, considering unmodeled dynamics due to nonlinearities in behaviour at low frequencies and measurement noise at high frequencies as existent uncertainties. Set membership estimation combined with model error modelling technique is used where the approach is based on worst case scenario with unknown but...


Saturated Neural Adaptive Robust Output Feedback Control of Robot Manipulators: An Experimental Comparative Study

In this study, an observer-based tracking controller is proposed and evaluated experimentally to solve the trajectory tracking problem of robotic manipulators with torque saturation in the presence of model uncertainties and external disturbances. In comparison with the state-of-the-art observer-based controllers in the literature, this paper introduces a saturated observer-based controller bas...


ASR-driven Binary Mask Estimation for Robust Automatic Speech Recognition

Additive noise has long been an issue for robust automatic speech recognition (ASR) systems. One approach to noise robustness is the removal of noise information through segregation by binary time-frequency masks; each time-frequency unit in a spectro-temporal representation of the speech signal is labeled either noise-dominant or signal-dominant. The noise-dominant units are masked and their e...



Journal title:
  • Speech Communication

Volume 87, Issue 

Pages -

Publication date 2017